In this analysis different portfolios consisting of stocks that are among the top 10 constituents of the MSCI Spain Index are backtested using the functionality of the portvine package. First the most important packages are loaded and after that the data will be imported and discussed shortly.
library(portvine)
library(tidyverse)
# utility color vector for visualizations
yes_no_cols <- c("#92B8DE", "#db4f59")
theme_set(
theme_minimal() +
theme(plot.title = ggtext::element_markdown(size = 11),
plot.subtitle = ggtext::element_markdown(size = 9))
)
# load the data
load(here::here("data", "msci_spain_data_clean.RData"))
glimpse(msci_spain_data)
## Rows: 3,089
## Columns: 11
## $ date <dttm> 2010-01-05, 2010-01-06, 2010-01-07, 2010-01-08, 2010~
## $ msci_spain_index <dbl> 7.227127e-03, -3.519919e-03, -3.271415e-03, -7.783202~
## $ iberdrola <dbl> 0.0011983863, -0.0013316290, -0.0064166196, -0.001046~
## $ banco_santander <dbl> 0.010994558, 0.007111767, -0.004592700, 0.003339731, ~
## $ inditex <dbl> -0.0066202730, 0.0004579803, -0.0147604156, 0.0034790~
## $ cellnex_telecom <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ repsol_ypf <dbl> 0.0029173273, -0.0045062036, 0.0005267713, -0.0010632~
## $ ferrovial <dbl> 0.013137950, 0.032525280, -0.005260778, 0.003207450, ~
## $ amadeus_it_group <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ telefonica <dbl> -0.0015128243, -0.0073567397, -0.0099731076, -0.01763~
## $ bbv_argentaria <dbl> 0.0069158977, 0.0038205007, -0.0045843707, 0.00763226~
One can see that there are 11 columns. The date column gives obviously the date as daily return data will be analyzed here. The daily log returns of the overall index are given in the column msci_spain_index and all other columns contain the daily log returns of the respective stock. One can also detect missing values which can be nicely observed in the visualization below.
This makes total sense as for example Cellnex Telecom stocks are traded since 2015, so there can not be earlier return data. So one could now either impute the values in order to use the whole time frame but here the more simplistic approach of just looking at a time frame without any missing values is applied. This reduced data set is available as msci_spain_complete_data and contains only complete cases.
summary(msci_spain_complete_data$date)
## Min. 1st Qu. Median
## "2015-05-08 00:00:00" "2016-12-21 18:00:00" "2018-08-07 12:00:00"
## Mean 3rd Qu. Max.
## "2018-08-07 16:52:04" "2020-03-24 06:00:00" "2021-11-08 00:00:00"
nrow(msci_spain_complete_data)
## [1] 1696
This means that the time frame is from the 8th Mai of 2015 until the 8th November of 2021 and there are 1696 observations. Below one can have a look at daily log returns of the overall MSCI Spain index.
So one can observe greater volatility during the stock market selloffs 2015-2016 maybe due to Chinese stock market turbulences, the EU dept crisis and the Brexit votum as well as for the first ‘Covid-19 year’ 2020. This should give the opportunity to access how robust the risk measure estimates are during a crisis. Now specify the first portfolio and analyze it unconditionally as well as conditionally.
misslHere one picks as the stock portfolio equally weighted all the stocks without the one with the highest market capitalization that makes roughly 17% of the MSCI Spain i.e. Iberdrola, that is mainly in the energy business. First one models this portfolio and estimates the risk estimates unconditionally and then conditionally on the biggest player Iberdrola.
missl portfolioFirst specify suitable marginal model settings as well as settings for the vine copula.
missl_marginal_settings <- marginal_settings(
train_size = 1000,
refit_size = 50
)
missl_vine_settings <- vine_settings(
train_size = 200,
refit_size = 25,
family_set = "all",
vine_type = "rvine"
)
To compare run times one estimates the risk once sequentially and once in parallel.
# sequentially
missl_risk_roll_seq <- estimate_risk_roll(
msci_spain_complete_data %>% select(-c(date, msci_spain_index, iberdrola)),
weights = NULL,
marginal_settings = missl_marginal_settings,
vine_settings = missl_vine_settings,
alpha = c(0.01, 0.05, 0.1),
risk_measures = c("VaR", "ES_mean", "ES_median", "ES_mc"),
n_samples = 1000,
n_mc_samples = 1000,
trace = TRUE
)
## The last window of interest is shorter (width: 46 ) than the specified window width of 50
##
## Fit marginal models:
## banco_santander inditex cellnex_telecom repsol_ypf ferrovial amadeus_it_group telefonica bbv_argentaria
##
## Fit vine copula models and estimate risk.
## Vine windows:
## (1/28) (2/28) (3/28) (4/28) (5/28) (6/28) (7/28) (8/28) (9/28) (10/28) (11/28) (12/28) (13/28) (14/28) (15/28) (16/28) (17/28) (18/28) (19/28) (20/28) (21/28) (22/28) (23/28) (24/28) (25/28) (26/28) (27/28) (28/28)
# 4 parallel R session
future::plan("multisession", workers = 4)
missl_risk_roll <- estimate_risk_roll(
msci_spain_complete_data %>% select(-c(date, msci_spain_index, iberdrola)),
weights = NULL,
marginal_settings = missl_marginal_settings,
vine_settings = missl_vine_settings,
alpha = c(0.01, 0.05, 0.1),
risk_measures = c("VaR", "ES_mean", "ES_median", "ES_mc"),
n_samples = 1000,
n_mc_samples = 1000,
trace = TRUE
)
## The last window of interest is shorter (width: 46 ) than the specified window width of 50
##
## Fit marginal models:
## banco_santander inditex cellnex_telecom repsol_ypf ferrovial amadeus_it_group telefonica bbv_argentaria
##
## Fit vine copula models and estimate risk.
## Vine windows:
## (1/28) (2/28) (3/28) (4/28) (5/28) (6/28) (7/28) (8/28) (9/28) (10/28) (11/28) (12/28) (13/28) (14/28) (15/28) (16/28) (17/28) (18/28) (19/28) (20/28) (21/28) (22/28) (23/28) (24/28) (25/28) (26/28) (27/28) (28/28)
future::plan("sequential")
The results and their respective run times can be seen below.
missl_risk_roll_seq
## An object of class <portvine_roll>
## Number of ARMA-GARCH/ marginal windows: 14
## Number of vine windows: 28
## Risk measures estimated: VaR ES_mean ES_median ES_mc
## Alpha levels used: 0.01 0.05 0.1
##
## Time taken: 6.6659 minutes
missl_risk_roll
## An object of class <portvine_roll>
## Number of ARMA-GARCH/ marginal windows: 14
## Number of vine windows: 28
## Risk measures estimated: VaR ES_mean ES_median ES_mc
## Alpha levels used: 0.01 0.05 0.1
##
## Time taken: 2.4207 minutes
# thus the speedup is already for 4 workers
missl_risk_roll_seq@time_taken / missl_risk_roll@time_taken
## [1] 2.75366
# besides the print method one can also call the summary for more details
summary(missl_risk_roll)
## An object of class <portvine_roll>
##
## --- Marginal models ---
## Number of ARMA-GARCH/ marginal windows: 14
## Train size: 1000
## Refit size: 50
##
## --- Vine copula models ---
## Number of vine windows: 28
## Train size: 200
## Refit size: 25
## Vine copula type: rvine
## Vine family set: all
##
## --- Risk estimation ---
## Risk measures estimated: VaR ES_mean ES_median ES_mc
## Alpha levels used: 0.01 0.05 0.1
## Number of estimated risk measures: 8352
## Number of samples for each risk estimation: 1000
##
## Time taken: 2.4207 minutes.
Now one can use the accessor functions to extract the fitted marginal as well as the vine copula models.
### get the fitted marginal models
missl_fitted_marginals <- fitted_marginals(missl_risk_roll)
# for each asset one now can access the uGARCHroll object that for example
# contains the fitted coefficients for the second rolling window
missl_fitted_marginals$banco_santander@model$coef[[2]]$coef
## Estimate Std. Error t value Pr(>|t|)
## mu -3.958998e-04 5.326216e-04 -0.7433042 0.45729755
## ar1 4.061090e-01 2.231842e-01 1.8196139 0.06881782
## ma1 -3.838058e-01 2.235111e-01 -1.7171666 0.08594874
## omega 6.565105e-06 2.742517e-05 0.2393824 0.81080906
## alpha1 5.782711e-02 5.768279e-02 1.0025020 0.31610118
## beta1 9.225192e-01 4.234043e-02 21.7881395 0.00000000
## skew 9.708474e-01 3.740128e-02 25.9575985 0.00000000
## shape 6.337415e+00 3.180092e+00 1.9928400 0.04627897
### get the fitted vine copula models
missl_fitted_vines <- fitted_vines(missl_risk_roll)
# then one can for example have a look at how the fitted vine structure evolved
# over time by plotting the first tree for different rolloing windows
rvinecopulib:::plot.vinecop(missl_fitted_vines[[1]])
rvinecopulib:::plot.vinecop(missl_fitted_vines[[28]])
Now finally the risk estimates are going to be analyzed. First one compares the 3 different methods of estimating the Expected Shortfall.
Also one might compare the number of exceedances.
risk_estimates(missl_risk_roll,
risk_measures = c("ES_mean", "ES_median", "ES_mc"),
alpha = 0.05,
exceeded = TRUE) %>%
group_by(risk_measure) %>%
summarise(relative_exceedances = mean(exceeded))
Here actually we find an indication for the fact that the values that fell below the corresponding VaR where left skewed which leads to the fact that the mean will be a more conservative estimate in those cases which might explain the slightly lower exceedance rate.
risk_estimates(missl_risk_roll,
risk_measures = c("ES_mean", "VaR"),
alpha = 0.01) %>%
ggplot() +
geom_line(aes(x = row_num, y = realized), col = "lightgrey") +
geom_line(aes(x = row_num, y = risk_est,
col = risk_measure, linetype = risk_measure),
size = 0.5) +
labs(x = "estimation window", y = "portfolio log returns",
linetype = "Risk measure",
title = "Comparison of risk measure behaivior for alpha level 0.01") +
scale_color_manual(values = c(yes_no_cols[1], "#477042"),
name = "Risk measure") +
theme(axis.text.x = element_blank(),
axis.ticks.x = element_blank())
risk_estimates(missl_risk_roll,
risk_measures = c("VaR"),
alpha = 0.01,
exceeded = TRUE) %>%
ggplot() +
geom_line(aes(x = row_num, y = realized), col = "lightgrey") +
geom_point(aes(x = row_num, y = realized, col = exceeded)) +
geom_line(aes(x = row_num, y = risk_est), col = yes_no_cols[1]) +
scale_color_manual(values = c("lightgrey", yes_no_cols[2])) +
labs(x = "estimation window",
y = "portfolio log returns",
col = "Exceeded",
subtitle = paste0("Exceedances are highlighted in ",
"<span style='color:",
yes_no_cols[2],
"'>**red**</span>",
"."),
title = "Risk measure: VaR with alpha level 0.01"
) +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
legend.position = "none")
risk_estimates(missl_risk_roll,
risk_measures = c("ES_mean"),
alpha = 0.01,
exceeded = TRUE) %>%
ggplot() +
geom_line(aes(x = row_num, y = realized), col = "lightgrey") +
geom_point(aes(x = row_num, y = realized, col = exceeded)) +
geom_line(aes(x = row_num, y = risk_est), col = yes_no_cols[1]) +
scale_color_manual(values = c("lightgrey", yes_no_cols[2])) +
labs(x = "estimation window",
y = "portfolio log returns",
col = "Exceeded",
subtitle = paste0("Exceedances are highlighted in ",
"<span style='color:",
yes_no_cols[2],
"'>**red**</span>",
"."),
title = "Risk measure: ES (mean estimation) with alpha level 0.01"
) +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
legend.position = "none")
missl portfolioEstimation is performed in the same fashion as above with some additional arguments concerning the conditioning on the stock Iberdrola.
missl_cond_marginal_settings <- marginal_settings(
train_size = 1000,
refit_size = 50
)
missl_cond_vine_settings <- vine_settings(
train_size = 200,
refit_size = 25,
family_set = "parametric",
vine_type = "dvine"
)
future::plan("multisession", workers = 6)
missl_cond_risk_roll <- estimate_risk_roll(
msci_spain_complete_data %>% select(-c(date, msci_spain_index)),
weights = NULL,
marginal_settings = missl_cond_marginal_settings,
vine_settings = missl_cond_vine_settings,
alpha = c(0.01, 0.05, 0.1),
risk_measures = c("VaR", "ES_mean", "ES_median", "ES_mc"),
n_samples = 1000,
n_mc_samples = 1000,
cond_vars = "iberdrola",
cond_alpha = c(0.05, 0.5),
trace = TRUE
)
## The last window of interest is shorter (width: 46 ) than the specified window width of 50
##
## Fit marginal models:
## iberdrola banco_santander inditex cellnex_telecom repsol_ypf ferrovial amadeus_it_group telefonica bbv_argentaria
##
## Fit vine copula models and estimate risk.
## Vine windows:
## (1/28) (2/28) (3/28) (4/28) (5/28) (6/28) (7/28) (8/28) (9/28) (10/28) (11/28) (12/28) (13/28) (14/28) (15/28) (16/28) (17/28) (18/28) (19/28) (20/28) (21/28) (22/28) (23/28) (24/28) (25/28) (26/28) (27/28) (28/28)
future::plan("sequential")
summary(missl_cond_risk_roll)
## An object of class <cond_portvine_roll>
##
## --- Conditional settings ---
## Conditional variable(s): iberdrola
## Number of conditional estimated risk measures: 16704
## Conditioning quantiles: 0.05 0.5
##
## --- Marginal models ---
## Number of ARMA-GARCH/ marginal windows: 14
## Train size: 1000
## Refit size: 50
##
## --- Vine copula models ---
## Number of vine windows: 28
## Train size: 200
## Refit size: 25
## Vine copula type: dvine
## Vine family set: parametric
##
## --- Risk estimation ---
## Risk measures estimated: VaR ES_mean ES_median ES_mc
## Alpha levels used: 0.01 0.05 0.1
## Number of estimated risk measures: 8352
## Number of samples for each risk estimation: 1000
##
## Time taken: 129.3178 minutes.